Submitted: 18 May 2026
Posted: 19 May 2026
Abstract
Keywords:
1. Introduction
2. Existing CL Surveys and Remaining Gaps
CL
CL vs. Transfer Learning
CL vs. Multi-Task Learning
CL vs. Online Learning
Scope and Contribution of This Review
2.1. Setup
Basic Formulation of CL
3. Types of CL
Task-Incremental Learning
Applications of Task-Incremental Learning
Examples of Task-Incremental Learning
Challenges
Domain-Incremental Learning
3.0.1. Examples of Domain-Incremental Learning
Challenges
Class-Incremental Learning
Examples of Class-Incremental Learning
Challenges in Class-Incremental Learning
3.1. Data-Incremental Learning
- Incremental data arrival: The model receives data sequentially, one instance or batch at a time, without knowledge of whether the data introduces new classes or extends existing ones.
- No explicit task boundaries: Unlike task-based scenarios, data-incremental learning does not provide information about task transitions, requiring the model to infer patterns and adjust its learning dynamically.
- Challenges of catastrophic forgetting: As new data arrives, the model’s parameters may be updated in ways that overwrite knowledge of previously learned classes, leading to catastrophic forgetting.
- Adaptation and generalization: The model must generalize well to new instances and classes while preserving accuracy on old ones, which requires balancing plasticity (learning new data) against stability (retaining old knowledge).
Examples of Data-Incremental Learning
3.1.1. Challenges in Data-Incremental Learning
- Unstructured data streams: The lack of clear task boundaries increases the difficulty of organizing and processing data effectively.
- Memory constraints: Retaining past data or features for replay becomes resource-intensive as the volume of data grows.
- Class imbalance: Incrementally arriving data may introduce imbalanced class distributions, skewing the model’s performance.
Common strategies for addressing these challenges include:
- Replay mechanisms: Retaining a subset of past data, or using generative models to recreate previous data, for rehearsal (discussed in detail in Section 6).
- Dynamic networks: Expanding model capacity incrementally to accommodate new data without overwriting existing knowledge.
- Regularization methods: Penalizing changes to parameters critical for previously learned data to mitigate forgetting (discussed in Section 6).
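The regularization idea in the list above can be illustrated with a minimal sketch. It assumes a generic importance-weighted quadratic penalty rather than any specific published method. The names `quadratic_penalty`, `regularized_loss`, `anchor_params`, `importance`, and `strength` are hypothetical and chosen only for illustration, and how the importance weights are estimated is left outside the sketch.

```python
from typing import Sequence


def quadratic_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    importance: Sequence[float],
    strength: float = 1.0,
) -> float:
    """Importance-weighted squared distance from the anchor parameters.

    anchor_params: parameter values saved after learning earlier data
                   (hypothetical name).
    importance:    one non-negative weight per parameter; larger values mark
                   parameters assumed to matter more for earlier data.
    """
    if not (len(params) == len(anchor_params) == len(importance)):
        raise ValueError("all sequences must have the same length")
    total = sum(
        w * (p - a) ** 2 for p, a, w in zip(params, anchor_params, importance)
    )
    return 0.5 * strength * total


def regularized_loss(
    new_data_loss: float,
    params: Sequence[float],
    anchor_params: Sequence[float],
    importance: Sequence[float],
    strength: float = 1.0,
) -> float:
    """Loss on the new data plus the penalty for drifting from old parameters."""
    return new_data_loss + quadratic_penalty(
        params, anchor_params, importance, strength
    )
```

In this sketch a parameter with zero importance can move freely, while a parameter with high importance is pulled back toward its earlier value. The `strength` argument sets the balance between plasticity and stability.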
Other Emerging Paradigms in CL
Few-Shot CL
Unsupervised CL
Meta-CL
Federated CL
Multi-Agent CL
4. Theoretical Foundations of CL
Stability-Plasticity Dilemma

Catastrophic Forgetting
Forward and Backward Transfer
Representation Learning
Neuroscientific Motivation
Mathematical Frameworks
Regularization-Based Models
Replay and Memory Models
Dynamic Architectures
Bayesian Models
Information-Theoretic Models
5. The Catastrophic Forgetting Problem
Why Neural Networks Forget
Weight Updates and Parameter Drift
5.1. Factors Exacerbating Catastrophic Forgetting
5.2. Mitigation Strategies
6. Method Taxonomy in CL

Regularization-Based Methods

6.0.1. Elastic Weight Consolidation
6.0.2. Synaptic Intelligence and Related Methods
Replay-Based Methods

Experience Replay
Generative Replay
| Aspect | Experience Replay | Generative Replay |
|---|---|---|
| Memory usage | Stores selected raw samples or compressed examples. | Stores a generative model that synthesizes previous data. |
| Privacy | May be problematic when previous data are sensitive. | Avoids direct storage of raw samples but may still leak information if not properly controlled. |
| Replay quality | High fidelity because original samples are replayed. | Depends on generator quality, diversity, and label consistency. |
| Computational cost | Relatively low compared with training a generator. | Higher due to training and maintaining a generative model. |
| Best suited for | Class-incremental learning and reinforcement learning when memory is available. | Privacy-sensitive or memory-constrained settings where raw data cannot be stored. |
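To make the "stores selected raw samples" entry for experience replay concrete, the following is a minimal sketch of a fixed-capacity replay memory. It assumes reservoir sampling as the selection rule, which is one common choice and is not claimed to be the method used by any work discussed here. The class name `ReservoirBuffer` and its methods are hypothetical.

```python
import random
from typing import Any, List, Optional


class ReservoirBuffer:
    """Fixed-size replay memory filled by reservoir sampling (illustrative)."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.samples: List[Any] = []
        self.seen = 0  # number of stream items observed so far
        self._rng = random.Random(seed)

    def add(self, sample: Any) -> None:
        """Observe one stream item and keep it with probability capacity/seen."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
            return
        # Draw a slot uniformly from all items seen so far; the new item
        # replaces a stored one only if the slot falls inside the buffer.
        slot = self._rng.randrange(self.seen)
        if slot < self.capacity:
            self.samples[slot] = sample

    def sample(self, batch_size: int) -> List[Any]:
        """Return up to batch_size stored items, drawn without replacement."""
        k = min(batch_size, len(self.samples))
        return self._rng.sample(self.samples, k)


if __name__ == "__main__":
    buffer = ReservoirBuffer(capacity=5, seed=0)
    for item in range(1000):  # stand-in for a stream of (input, label) pairs
        buffer.add(item)
    replay_batch = buffer.sample(3)  # would be mixed with the current batch
    print(len(buffer.samples), len(replay_batch))
```

Memory use stays constant regardless of stream length, which corresponds to the memory trade-off in the table. Because raw samples are stored, the privacy concern listed for experience replay applies to this sketch as well.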
Architecture-Based Methods
Optimization-Based Methods
Representation-Learning Methods
Parameter-Efficient and Prompt-Based Methods
7. Evaluation Protocols, Benchmarks, and Metrics in CL
Benchmark Datasets
CL Evaluation Settings
Task Construction and Data Splits
Evaluation Metrics
7.0.1. Average Accuracy
Forgetting Measure
Forward Transfer
Backward Transfer
Memory and Computational Efficiency
Challenges in CL Evaluation
Toward Standardized Evaluation Protocols
8. Comparative Analysis of CL Methods
Overview of CL Method Categories
Regularization-Based Methods
Replay-Based Methods
Architecture-Based Methods
Optimization-Based Methods
Representation Learning Approaches
Prompt-Based and Parameter-Efficient CL
Comparison Across CL Settings
Memory, Scalability, and Computational Trade-Offs
9. Applications of CL
Applications in Healthcare and Medical Imaging
Applications in Robotics and Autonomous Systems
Applications in Natural Language Processing
Recommender Systems
Cybersecurity
10. Open Challenges and Future Directions
Catastrophic Forgetting and Long-Term Knowledge Retention
Scalability and Realistic Streaming Benchmarks
Memory, Computation, and Deployment Constraints
CL for Foundation Models
Parameter-Efficient Continual Adaptation
Multimodal CL
Privacy-Preserving and Federated CL
CL in Medical Imaging Under Domain Shift
Ethical, Fairness, and Safety Considerations
Toward Robust Lifelong AI
Summary

| Challenge | Description | Impact | Examples | Potential Solutions |
|---|---|---|---|---|
| Catastrophic Forgetting | Overwriting of previous knowledge when learning new tasks. | Loss of performance on earlier tasks, limiting multi-task applications. | A model trained on new object classes forgets previously learned ones. | Replay methods, regularization techniques (e.g., EWC, SI), parameter isolation (e.g., PNNs, PackNet). |
| Scalability to Real-World Tasks | Difficulty in handling diverse, undefined, and open-ended tasks found in real-world environments. | Limits practical applications, especially in dynamic or multi-domain environments. | A robot operating in a dynamic home environment fails to generalize across diverse tasks. | Dynamic architectures (e.g., expandable networks), meta-learning, unsupervised task detection. |
| Memory Constraints | Storing data from previous tasks is often infeasible for large-scale or resource-limited applications. | Limits model ability to effectively retain and replay past information. | Replay-based methods requiring storage of vast datasets for continual adaptation. | Efficient memory management techniques, synthetic replay using generative models, data pruning. |
| Computational Overhead | Increased computational demands for training and inference due to replay, regularization, or parameter isolation techniques. | Hinders real-time applications on edge devices or systems with limited resources. | On-device CL in IoT systems is slowed by high computational requirements. | Lightweight models, parameter optimization, pruning, and efficient task-specific parameter allocation. |
| Bias Amplification | Sequential learning may reinforce biases present in earlier data or tasks. | Skewed model behavior, disproportionately affecting certain demographic groups. | A financial model favoring certain demographics due to biased historical data. | Fairness-aware training, regular bias audits, diversity-focused data augmentation. |
| Transparency and Explainability | Models evolving continuously can become opaque, making their decision-making hard to interpret. | Erodes trust, particularly in sensitive applications like healthcare or finance. | Difficulty auditing a continually adapting medical diagnostic system. | Explainability frameworks, interpretable architecture designs, and model debugging tools. |
| Privacy Concerns | Replay-based methods storing or processing user data may violate privacy regulations. | Non-compliance with privacy laws (e.g., GDPR, HIPAA), leading to legal and ethical implications. | Retaining user data for replay in recommendation systems could breach user consent. | Privacy-preserving methods like federated learning, data anonymization, and synthetic data generation. |
| Unintended Consequences | Autonomous learning systems may exhibit behaviors or decisions not aligned with human intentions or societal norms. | Potential safety risks, ethical conflicts, or misaligned system behavior in real-world scenarios. | A self-learning robot adopts unsafe behaviors while optimizing a task autonomously. | Strict behavioral constraints, ethical guidelines for autonomous systems, and robust oversight mechanisms during model deployment. |
11. Conclusion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Azizi, A.; Zhang, Z.; Hua, W.; Li, M.; Igathinathane, C.; Yang, L.; Ampatzidis, Y.; Ghasemi-Varnamkhasti, M.; Zhang, M.; Li, H.; et al. Image processing and artificial intelligence for apple detection and localization: A comprehensive review. Comput. Sci. Rev. 2024, 54, 100690. [Google Scholar] [CrossRef]
- Annan, R.; Qingge, L. Artificial intelligence in COVID-19 research: A comprehensive survey of innovations, challenges, and future directions. Comput. Sci. Rev. 2025, 57, 100751. [Google Scholar] [CrossRef]
- Herrera, F. Reflections and attentiveness on eXplainable Artificial Intelligence (XAI). The journey ahead from criticisms to human–AI collaboration. Inf. Fusion 2025, 121, 103133. [Google Scholar]
- Utomo, S.; Pratap, A.; Karthikeyan, P.; Ayeelyan, J.; Hsu, H.C.; Hsiung, P.A. When explainable artificial intelligence meets data governance: Enhancing trustworthiness in multimodal gas classification. Inf. Fusion 2025, 103440. [Google Scholar] [CrossRef]
- Górriz, J.M.; Álvarez-Illán, I.; Álvarez-Marquina, A.; Arco, J.E.; Atzmueller, M.; Ballarini, F.; Barakova, E.; Bologna, G.; Bonomini, P.; Castellanos-Dominguez, G.; et al. Computational approaches to explainable artificial intelligence: advances in theory, applications and trends. Inf. Fusion 2023, 100, 101945. [Google Scholar] [CrossRef]
- Longo, L.; Brcic, M.; Cabitza, F.; Choi, J.; Confalonieri, R.; Del Ser, J.; Guidotti, R.; Hayashi, Y.; Herrera, F.; Holzinger, A.; et al. Explainable Artificial Intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions. Inf. Fusion 2024, 106, 102301. [Google Scholar] [CrossRef]
- Rezaee, K. Machine learning in automated diagnosis of autism spectrum disorder: A comprehensive review. Comput. Sci. Rev. 2025, 56, 100730. [Google Scholar] [CrossRef]
- Naser, M. From failure to fusion: A survey on learning from bad machine learning models. Inf. Fusion 2025, 120, 103122. [Google Scholar] [CrossRef]
- Escovedo, T.; Koshiyama, A.; da Cruz, A.A.; Vellasco, M. Neuroevolutionary learning in nonstationary environments. Appl. Intell. 2020, 50, 1590–1608. [Google Scholar] [CrossRef]
- Criado, M.F.; Casado, F.E.; Iglesias, R.; Regueiro, C.V.; Barro, S. Non-iid data and continual learning processes in federated learning: A long road ahead. Inf. Fusion 2022, 88, 263–280. [Google Scholar] [CrossRef]
- Nguyen, C.V.; Achille, A.; Lam, M.; Hassner, T.; Mahadevan, V.; Soatto, S. Toward understanding catastrophic forgetting in continual learning. arXiv 2019, arXiv:1908.01091. [Google Scholar] [CrossRef]
- ParisiGerman, I.; PartJose, L.; et al. Continual lifelong learning with neural networks. 2019. [Google Scholar] [CrossRef]
- Wang, L.; Zhang, X.; Su, H.; Zhu, J. A comprehensive survey of continual learning: Theory, method and application. In IEEE Transactions on Pattern Analysis and Machine Intelligence; 2024. [Google Scholar]
- Kirkpatrick, J.; Pascanu, R.; Rabinowitz, N.; Veness, J.; Desjardins, G.; Rusu, A.A.; Milan, K.; Quan, J.; Ramalho, T.; Grabska-Barwinska, A.; et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. 2017, 114, 3521–3526. [Google Scholar] [CrossRef] [PubMed]
- Xu, X.; Chen, J.; Thakur, D.; Hong, D. Multi-modal disease segmentation with continual learning and adaptive decision fusion. Inf. Fusion 2025, 102962. [Google Scholar]
- Wu, Y.; Li, Z.; Gao, Y.; Chiclana, F.; Chen, X.; Dong, Y. An endogenous and continual learning approach to personalize individual semantics to support linguistic consensus reaching. Inf. Fusion 2025, 114, 102640. [Google Scholar] [CrossRef]
- Yu, Y.; Du, Z.; Meng, L.; Li, J.; Hu, J. Adaptive online continual multi-view learning. Inf. Fusion 2024, 103, 102020. [Google Scholar] [CrossRef]
- Calvaresi, D.; Calbimonte, J.P. Real-time compliant stream processing agents for physical rehabilitation. Sensors 2020, 20, 746. [Google Scholar] [CrossRef]
- Shahrivari, S. Beyond batch processing: towards real-time and streaming big data. Computers 2014, 3, 117–129. [Google Scholar] [CrossRef]
- Parisi, G.I.; Lomonaco, V. Online continual learning on sequences. In Proceedings of the Recent Trends in Learning From Data: Tutorials from the INNS Big Data and Deep Learning Conference (INNSBDDL2019). Springer, 2020, pp. 197–221.
- Van de Ven, G.M.; Tuytelaars, T.; Tolias, A.S. Three types of incremental learning. Nat. Mach. Intell. 2022, 4, 1185–1197. [Google Scholar] [CrossRef]
- Bidaki, S.A.; Mohammadkhah, A.; Rezaee, K.; Hassani, F.; Eskandari, S.; Salahi, M.; Ghassemi, M.M. Online continual learning: A systematic literature review of approaches, challenges, and benchmarks. arXiv 2025, arXiv:2501.04897. [Google Scholar] [CrossRef]
- Zhou, D.W.; Wang, Q.W.; Qi, Z.H.; Ye, H.J.; Zhan, D.C.; Liu, Z. Class-incremental learning: A survey. In IEEE Transactions on Pattern Analysis and Machine Intelligence; 2024. [Google Scholar]
- Wickramasinghe, B.; Saha, G.; Roy, K. Continual learning: A review of techniques, challenges, and future directions. IEEE Trans. Artif. Intell. 2023, 5, 2526–2546. [Google Scholar] [CrossRef]
- Thrun, S.; Mitchell, T.M. Lifelong robot learning. Robot. Auton. Syst. 1995, 15, 25–46. [Google Scholar] [CrossRef]
- Tan, A.; Wang, Y.; Wu, W.Z.; Ding, W.; Liang, J. Multi-View Fusion Graph Attention Network for multilabel class incremental learning. Inf. Fusion 2025, 103309. [Google Scholar] [CrossRef]
- Li, D.; Wang, T.; Chen, J.; Kawaguchi, K.; Lian, C.; Zeng, Z. Multi-view class incremental learning. Inf. Fusion 2024, 102, 102021. [Google Scholar] [CrossRef]
- Zheng, Y.; Zhang, X.; Tian, Z.; Du, S. Enhancing few-shot lifelong learning through fusion of cross-domain knowledge. Inf. Fusion 2025, 115, 102730. [Google Scholar] [CrossRef]
- Mehta, S.V.; Patil, D.; Chandar, S.; Strubell, E. An empirical investigation of the role of pre-training in lifelong learning. J. Mach. Learn. Res. 2023, 24, 1–50. [Google Scholar]
- Kanakis, M.; Bruggemann, D.; Saha, S.; Georgoulis, S.; Obukhov, A.; Van Gool, L. Reparameterizing convolutions for incremental multi-task learning without task interference. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX 16. Springer, 2020, pp. 689–707.
- Vödisch, N.; Cattaneo, D.; Burgard, W.; Valada, A. Covio: Online continual learning for visual-inertial odometry. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 2464–2473.
- Ullah, Z.; Usman, M.; Gwak, J. MTSS-AAE: Multi-task semi-supervised adversarial autoencoding for COVID-19 detection based on chest X-ray images. Expert Syst. With Appl. 2023, 216, 119475. [Google Scholar] [CrossRef]
- Bonicelli, L.; Boschini, M.; Frascaroli, E.; Porrello, A.; Pennisi, M.; Bellitto, G.; Palazzo, S.; Spampinato, C.; Calderara, S. On the effectiveness of equivariant regularization for robust online continual learning. arXiv 2023, arXiv:2305.03648. [Google Scholar] [CrossRef]
- Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Inf. Fusion 2023, 99, 101805. [Google Scholar] [CrossRef]
- Abbass, H. What is artificial intelligence? IEEE Trans. Artif. Intell. 2021, 2, 94–95. [Google Scholar] [CrossRef]
- Smith, P.D. Hands-On Artificial Intelligence for Beginners: An introduction to AI concepts, algorithms, and their implementation; Packt Publishing Ltd, 2018. [Google Scholar]
- Chen, Z.; Liu, B. Lifelong machine learning; Morgan & Claypool Publishers, 2018. [Google Scholar]
- Parisi, G.I.; Kemker, R.; Part, J.L.; Kanan, C.; Wermter, S. Continual lifelong learning with neural networks: A review. Neural Netw. 2019, 113, 54–71. [Google Scholar] [CrossRef] [PubMed]
- Hayes, T.L.; Krishnan, G.P.; Bazhenov, M.; Siegelmann, H.T.; Sejnowski, T.J.; Kanan, C. Replay in deep learning: Current approaches and missing biological elements. Neural Comput. 2021, 33, 2908–2950. [Google Scholar] [CrossRef] [PubMed]
- Kudithipudi, D.; Aguilar-Simon, M.; Babb, J.; Bazhenov, M.; Blackiston, D.; Bongard, J.; Brna, A.P.; Chakravarthi Raja, S.; Cheney, N.; Clune, J.; et al. Biological underpinnings for lifelong learning machines. Nat. Mach. Intell. 2022, 4, 196–210. [Google Scholar] [CrossRef]
- Hadsell, R.; Rao, D.; Rusu, A.A.; Pascanu, R. Embracing change: Continual learning in deep neural networks. Trends Cogn. Sci. 2020, 24, 1028–1040. [Google Scholar] [CrossRef] [PubMed]
- Qu, H.; Rahmani, H.; Xu, L.; Williams, B.; Liu, J. Recent advances of continual learning in computer vision: An overview. IET Comput. Vis. 2025, 19, e70013. [Google Scholar] [CrossRef]
- Mai, Z.; Li, R.; Jeong, J.; Quispe, D.; Kim, H.; Sanner, S. Online continual learning in image classification: An empirical survey. Neurocomputing 2022, 469, 28–51. [Google Scholar] [CrossRef]
- Masana, M.; Twardowski, B.; Van de Weijer, J. On class orderings for incremental learning. arXiv 2020, arXiv:2007.02145. [Google Scholar] [CrossRef]
- Biesialska, M.; Biesialska, K.; Costa-Jussa, M.R. Continual lifelong learning in natural language processing: A survey. arXiv 2020, arXiv:2012.09823. [Google Scholar] [CrossRef]
- Ke, Z.; Liu, B. Continual learning of natural language processing tasks: A survey. arXiv 2022, arXiv:2211.12701. [Google Scholar]
- Khetarpal, K.; Riemer, M.; Rish, I.; Precup, D. Towards continual reinforcement learning: A review and perspectives. J. Artif. Intell. Res. 2022, 75, 1401–1476. [Google Scholar] [CrossRef]
- Ghosh, S. Dynamic vaes with generative replay for continual zero-shot learning. arXiv 2021, arXiv:2104.12468. [Google Scholar] [CrossRef]
- Singh, P.; Mazumder, P.; Rai, P.; Namboodiri, V.P. Rectification-based knowledge retention for continual learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 15282–15291.
- Tao, X.; Hong, X.; Chang, X.; Dong, S.; Wei, X.; Gong, Y. Few-shot class-incremental learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2020; pp. 12183–12192. [Google Scholar]
- Wang, L.; Yang, K.; Li, C.; Hong, L.; Li, Z.; Zhu, J. Ordisco: Effective and efficient usage of incremental unlabeled data for semi-supervised continual learning. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2021; pp. 5383–5392. [Google Scholar]
- Joseph, K.; Khan, S.; Khan, F.S.; Balasubramanian, V.N. Towards open world object detection. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021; pp. 5830–5840. [Google Scholar]
- Wang, Q.F.; Geng, X.; Lin, S.X.; Xia, S.Y.; Qi, L.; Xu, N. Learngene: From open-world to your learning task. Proc. Proc. AAAI Conf. Artif. Intell. 2022, Vol. 36, 8557–8565. [Google Scholar]
- Hu, D.; Yan, S.; Lu, Q.; Hong, L.; Hu, H.; Zhang, Y.; Li, Z.; Wang, X.; Feng, J. How well does self-supervised pre-training perform with streaming data? arXiv 2021, arXiv:2104.12081. [Google Scholar]
- Rao, D.; Visin, F.; Rusu, A.; Pascanu, R.; Teh, Y.W.; Hadsell, R. Continual unsupervised representation learning. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Ruvolo, P.; Eaton, E. ELLA: An efficient lifelong learning algorithm. In Proceedings of the International conference on machine learning. PMLR; 2013; pp. 507–515. [Google Scholar]
- Masse, N.Y.; Grant, G.D.; Freedman, D.J. Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization. Proc. Natl. Acad. Sci. 2018, 115, E10467–E10475. [Google Scholar] [CrossRef]
- Ramesh, R.; Chaudhari, P. Model zoo: A growing" brain" that learns continually. arXiv 2021, arXiv:2106.03027. [Google Scholar]
- PourKeshavarzi, M.; Zhao, G.; Sabokrou, M. Looking back on learned experiences for class/task incremental learning. In Proceedings of the International Conference on Learning Representations; 2021. [Google Scholar]
- Xie, X.; Xu, J.; Hu, P.; Zhang, W.; Huang, Y.; Zheng, W.; Wang, R. Task-incremental medical image classification with task-specific batch normalization. In Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV); 2023; Springer; pp. 309–320. [Google Scholar]
- Feng, F.; Chan, R.H.; Shi, X.; Zhang, Y.; She, Q. Challenges in task incremental learning for assistive robotics. IEEE Access 2019, 8, 3434–3441. [Google Scholar] [CrossRef]
- Lopez-Paz, D.; Ranzato, M. Gradient episodic memory for continual learning. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Vogelstein, J.T.; Dey, J.; Helm, H.S.; LeVine, W.; Mehta, R.D.; Tomita, T.M.; Xu, H.; Geisa, A.; Wang, Q.; van de Ven, G.M.; et al. A Simple Lifelong Learning Approach. arXiv 2020, arXiv:2004.12908. [Google Scholar]
- Ke, Z.; Liu, B.; Xu, H.; Shu, L. CLASSIC: Continual and contrastive learning of aspect sentiment classification tasks. arXiv 2021, arXiv:2112.02714. [Google Scholar] [CrossRef]
- Mirza, M.J.; Masana, M.; Possegger, H.; Bischof, H. An efficient domain-incremental learning approach to drive in all weather conditions. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022; pp. 3001–3011. [Google Scholar]
- Aljundi, R.; Chakravarty, P.; Tuytelaars, T. Expert gate: Lifelong learning with a network of experts. In Proceedings of the Proceedings of the IEEE conference on computer vision and pattern recognition; 2017; pp. 3366–3375. [Google Scholar]
- Von Oswald, J.; Henning, C.; Grewe, B.F.; Sacramento, J. Continual learning with hypernetworks. arXiv 2019, arXiv:1906.00695. [Google Scholar]
- Verma, V.K.; Liang, K.J.; Mehta, N.; Rai, P.; Carin, L. Efficient feature transformations for discriminative and generative continual learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021; pp. 13865–13875. [Google Scholar]
- Lomonaco, V.; Maltoni, D. Core50: a new dataset and benchmark for continuous object recognition. In Proceedings of the Conference on robot learning. PMLR; 2017; pp. 17–26. [Google Scholar]
- Garg, P.; Saluja, R.; Balasubramanian, V.N.; Arora, C.; Subramanian, A.; Jawahar, C. Multi-domain incremental learning for semantic segmentation. In Proceedings of the Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision; 2022; pp. 761–771. [Google Scholar]
- Capuano, N.; Greco, L.; Ritrovato, P.; Vento, M. Sentiment analysis for customer relationship management: an incremental learning approach. Appl. Intell. 2021, 51, 3339–3352. [Google Scholar] [CrossRef]
- Rebuffi, S.A.; Kolesnikov, A.; Sperl, G.; Lampert, C.H. icarl: Incremental classifier and representation learning. In Proceedings of the Proceedings of the IEEE conference on Computer Vision and Pattern Recognition; 2017; pp. 2001–2010. [Google Scholar]
- Shin, H.; Lee, J.K.; Kim, J.; Kim, J. Continual learning with deep generative replay. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Van de Ven, G.M.; Siegelmann, H.T.; Tolias, A.S. Brain-inspired replay for continual learning with artificial neural networks. Nat. Commun. 2020, 11, 4069. [Google Scholar] [CrossRef]
- Zhou, D.W.; Yang, Y.; Zhan, D.C. Learning to classify with incremental new class. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 2429–2443. [Google Scholar] [CrossRef]
- Belouadah, E.; Popescu, A.; Kanellos, I. A comprehensive study of class incremental learning algorithms for visual tasks. Neural Netw. 2021, 135, 38–54. [Google Scholar] [CrossRef]
- Masana, M.; Liu, X.; Twardowski, B.; Menta, M.; Bagdanov, A.D.; Van De Weijer, J. Class-incremental learning: survey and performance evaluation on image classification. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 5513–5533. [Google Scholar] [CrossRef]
- Channappayya, S.; Tamma, B.R.; et al. Augmented memory replay-based continual learning approaches for network intrusion detection. Adv. Neural Inf. Process. Syst. 2023, 36, 17156–17169. [Google Scholar]
- Li, X.; Wang, S.; Sun, J.; Xu, Z. Variational data-free knowledge distillation for continual learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 12618–12634. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
- Park, J.; Kang, M.; Han, B. Class-incremental learning for action recognition in videos. In Proceedings of the Proceedings of the IEEE/CVF international conference on computer vision; 2021; pp. 13698–13707. [Google Scholar]
- Villa, A.; Alhamoud, K.; Escorcia, V.; Caba, F.; Alcázar, J.L.; Ghanem, B. vclimb: A novel video class incremental learning benchmark. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022; pp. 19035–19044. [Google Scholar]
- Shmelkov, K.; Schmid, C.; Alahari, K. Incremental learning of object detectors without catastrophic forgetting. In Proceedings of the Proceedings of the IEEE international conference on computer vision; 2017; pp. 3400–3409. [Google Scholar]
- Girshick, R. Fast r-cnn. In Proceedings of the Proceedings of the IEEE international conference on computer vision; 2015; pp. 1440–1448. [Google Scholar]
- Ramakrishnan, K.; Panda, R.; Fan, Q.; Henning, J.; Oliva, A.; Feris, R. Relationship matters: Relation guided knowledge transfer for incremental learning of object detectors. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops; 2020; pp. 250–251. [Google Scholar]
- Paik, I.; Oh, S.; Kwak, T.; Kim, I. Overcoming catastrophic forgetting by neuron-level plasticity control. Proc. Proc. AAAI Conf. Artif. Intell. 2020, Vol. 34, 5339–5346. [Google Scholar] [CrossRef]
- Zhou, X.; Wang, D.; Krähenbühl, P. Objects as points. arXiv 2019, arXiv:1904.07850. [Google Scholar]
- Li, D.; Tasci, S.; Ghosh, S.; Zhu, J.; Zhang, J.; Heck, L. RILOD: Near real-time incremental learning for object detection at the edge. In Proceedings of the Proceedings of the 4th ACM/IEEE Symposium on Edge Computing; 2019; pp. 113–126. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the Proceedings of the IEEE international conference on computer vision; 2017; pp. 2980–2988. [Google Scholar]
- Feng, T.; Wang, M.; Yuan, H. Overcoming catastrophic forgetting in incremental object detection via elastic response distillation. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022; pp. 9427–9436. [Google Scholar]
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012. [Google Scholar]
- Hao, Y.; Fu, Y.; Jiang, Y.G.; Tian, Q. An end-to-end architecture for class-incremental object detection with knowledge distillation. In Proceedings of the 2019 IEEE International Conference on Multimedia and Expo (ICME); IEEE, 2019; pp. 1–6. [Google Scholar]
- Peng, C.; Zhao, K.; Lovell, B.C. Faster ilod: Incremental learning for object detectors based on faster rcnn. Pattern Recognit. Lett. 2020, 140, 109–115. [Google Scholar] [CrossRef]
- Zhang, J.; Zhang, J.; Ghosh, S.; Li, D.; Tasci, S.; Heck, L.; Zhang, H.; Kuo, C.C.J. Class-incremental learning via deep model consolidation. In Proceedings of the Proceedings of the IEEE/CVF winter conference on applications of computer vision; 2020; pp. 1131–1140. [Google Scholar]
- Dong, N.; Zhang, Y.; Ding, M.; Lee, G.H. Bridging non co-occurrence with unlabeled in-the-wild data for incremental object detection. Adv. Neural Inf. Process. Syst. 2021, 34, 30492–30503. [Google Scholar]
- Joseph, K.; Rajasegaran, J.; Khan, S.; Khan, F.S.; Balasubramanian, V.N. Incremental object detection via meta-learning. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 9209–9216. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28. [Google Scholar] [CrossRef]
- Zhao, N.; Lee, G.H. Static-dynamic co-teaching for class-incremental 3d object detection. Proc. Proc. AAAI Conf. Artif. Intell. 2022, Vol. 36, 3436–3445. [Google Scholar] [CrossRef]
- Wang, J.; Wang, X.; Shang-Guan, Y.; Gupta, A. Wanderlust: Online continual object detection in the real world. In Proceedings of the Proceedings of the IEEE/CVF international conference on computer vision; 2021; pp. 10829–10838. [Google Scholar]
- Perez-Rua, J.M.; Zhu, X.; Hospedales, T.M.; Xiang, T. Incremental few-shot object detection. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2020; pp. 13846–13855. [Google Scholar]
- Feng, J.; Phillips, R.V.; Malenica, I.; Bishara, A.; Hubbard, A.E.; Celi, L.A.; Pirracchio, R. Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare. npj Digit. Med. 2022, 5, 66. [Google Scholar] [CrossRef]
- Chrisley, R. Embodied artificial intelligence. Artif. Intell. 2003, 149, 131–150. [Google Scholar] [CrossRef]
- Duan, J.; Yu, S.; Tan, H.L.; Zhu, H.; Tan, C. A survey of embodied ai: From simulators to research tasks. IEEE Trans. Emerg. Top. Comput. Intell. 2022, 6, 230–244. [Google Scholar] [CrossRef]
- Franklin, S. Autonomous agents as embodied AI. Cybern. Syst. 1997, 28, 499–520. [Google Scholar] [CrossRef]
- Shi, G.; Wu, Y.; Liu, J.; Wan, S.; Wang, W.; Lu, T. Incremental few-shot semantic segmentation via embedding adaptive-update and hyper-class representation. In Proceedings of the Proceedings of the 30th ACM international conference on multimedia; 2022; pp. 5547–5556. [Google Scholar]
- Ganea, D.A.; Boom, B.; Poppe, R. Incremental few-shot instance segmentation. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021; pp. 1185–1194. [Google Scholar]
- Jin, X.; Lin, B.Y.; Rostami, M.; Ren, X. Learn continually, generalize rapidly: Lifelong knowledge accumulation for few-shot learning. arXiv 2021, arXiv:2104.08808. [Google Scholar]
- Cossu, A.; Carta, A.; Passaro, L.; Lomonaco, V.; Tuytelaars, T.; Bacciu, D. Continual pre-training mitigates forgetting in language and vision. Neural Netw. 2024, 179, 106492. [Google Scholar] [CrossRef]
- Madaan, D.; Yoon, J.; Li, Y.; Liu, Y.; Hwang, S.J. Representational continuity for unsupervised continual learning. arXiv 2021, arXiv:2110.06976. [Google Scholar]
- Riemer, M.; Cases, I.; Ajemian, R.; Liu, M.; Rish, I.; Tu, Y.; Tesauro, G. Learning to learn without forgetting by maximizing transfer and minimizing interference. arXiv 2018, arXiv:1810.11910. [Google Scholar]
- Guo, Q.; Zhao, W.; Lyu, Z.; Zhao, T. A GAN enhanced meta-deep reinforcement learning approach for DCN routing optimization. Inf. Fusion 2025, 121, 103160. [Google Scholar] [CrossRef]
- Zhao, Y.; Zhong, Z.; Yang, F.; Luo, Z.; Lin, Y.; Li, S.; Sebe, N. Learning to generalize unseen domains via memory-based multi-source meta-learning for person re-identification. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021; pp. 6277–6286. [Google Scholar]
- Javed, K.; White, M. Meta-learning representations for continual learning. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Beaulieu, S.; Frati, L.; Miconi, T.; Lehman, J.; Stanley, K.O.; Clune, J.; Cheney, N. Learning to continually learn. In ECAI 2020; IOS Press, 2020; pp. 992–1001. [Google Scholar]
- Lee, E.; Huang, C.H.; Lee, C.Y. Few-shot and continual learning with attentive independent mechanisms. In Proceedings of the Proceedings of the IEEE/CVF international conference on computer vision; 2021; pp. 9455–9464. [Google Scholar]
- Rajasegaran, J.; Khan, S.; Hayat, M.; Khan, F.S.; Shah, M. itaml: An incremental task-agnostic meta-learning approach. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020; pp. 13588–13597. [Google Scholar]
- Gupta, G.; Yadav, K.; Paull, L. Look-ahead meta learning for continual learning. Adv. Neural Inf. Process. Syst. 2020, 33, 11588–11598. [Google Scholar]
- Caccia, L.; Belilovsky, E.; Caccia, M.; Pineau, J. Online learned continual compression with adaptive quantization modules. In Proceedings of the International conference on machine learning. PMLR; 2020; pp. 1240–1250. [Google Scholar]
- KJ, J.; N Balasubramanian, V. Meta-consolidation for continual learning. Adv. Neural Inf. Process. Syst. 2020, 33, 14374–14386. [Google Scholar]
- Henning, C.; Cervera, M.; D’Angelo, F.; Von Oswald, J.; Traber, R.; Ehret, B.; Kobayashi, S.; Grewe, B.F.; Sacramento, J. Posterior meta-replay for continual learning. Adv. Neural Inf. Process. Syst. 2021, 34, 14135–14149. [Google Scholar]
- Hurtado, J.; Raymond, A.; Soto, A. Optimizing reusable knowledge for continual learning via metalearning. Adv. Neural Inf. Process. Syst. 2021, 34, 14150–14162. [Google Scholar]
- Wang, R.; Bao, Y.; Zhang, B.; Liu, J.; Zhu, W.; Guo, G. Anti-retroactive interference for lifelong learning. In Proceedings of the European Conference on Computer Vision; 2022; Springer; pp. 163–178. [Google Scholar]
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the Artificial intelligence and statistics. PMLR; 2017; pp. 1273–1282. [Google Scholar]
- Yoon, J.; Jeong, W.; Lee, G.; Yang, E.; Hwang, S.J. Federated continual learning with weighted inter-client transfer. In Proceedings of the International Conference on Machine Learning. PMLR; 2021; pp. 12073–12086. [Google Scholar]
- Usmanova, A.; Portet, F.; Lalanda, P.; Vega, G. A distillation-based approach integrating continual learning and federated learning for pervasive services. arXiv 2021, arXiv:2109.04197. [Google Scholar] [CrossRef]
- Park, T.J.; Kumatani, K.; Dimitriadis, D. Tackling dynamics in federated incremental learning with variational embedding rehearsal. arXiv 2021, arXiv:2110.09695. [Google Scholar] [CrossRef]
- Mermillod, M.; Bugaiska, A.; Bonin, P. The stability-plasticity dilemma: Investigating the continuum from catastrophic forgetting to age-limited learning effects. 2013. [Google Scholar] [CrossRef]
- Grossberg, S. Adaptive Resonance Theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Netw. 2013, 37, 1–47. [Google Scholar] [CrossRef]
- Abraham, W.C.; Robins, A. Memory retention–the synaptic stability versus plasticity dilemma. Trends Neurosci. 2005, 28, 73–78. [Google Scholar] [CrossRef]
- Hebb, D.O. The organization of behavior: A neuropsychological theory; Psychology press, 2005. [Google Scholar]
- Power, J.D.; Schlaggar, B.L. Neural plasticity across the lifespan. Wiley Interdiscip. Rev. Dev. Biol. 2017, 6, e216. [Google Scholar] [CrossRef]
- McCloskey, M.; Cohen, N.J. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of learning and motivation; Elsevier, 1989; Vol. 24, pp. 109–165. [Google Scholar]
- Ratcliff, R. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol. Rev. 1990, 97, 285. [Google Scholar] [CrossRef]
- Zhao, J.; Zhang, X.; Zhao, B.; Hu, W.; Diao, T.; Wang, L.; Zhong, Y.; Li, Q. Genetic dissection of mutual interference between two consecutive learning tasks in Drosophila. Elife 2023, 12, e83516. [Google Scholar] [CrossRef]
- Hayashi-Takagi, A.; Yagishita, S.; Nakamura, M.; Shirai, F.; Wu, Y.I.; Loshbaugh, A.L.; Kuhlman, B.; Hahn, K.M.; Kasai, H. Labelling and optical erasure of synaptic memory traces in the motor cortex. Nature 2015, 525, 333–338. [Google Scholar] [CrossRef] [PubMed]
- Yang, G.; Pan, F.; Gan, W.B. Stably maintained dendritic spines are associated with lifelong memories. Nature 2009, 462, 920–924. [Google Scholar] [CrossRef] [PubMed]
- Zhang, X.; Li, Q.; Wang, L.; Liu, Z.J.; Zhong, Y. Active protection: learning-activated Raf/MAPK activity protects labile memory from Rac1-independent forgetting. Neuron 2018, 98, 142–155. [Google Scholar] [CrossRef]
- Huszár, F. Note on the quadratic penalties in elastic weight consolidation. Proc. Natl. Acad. Sci. 2018, 115, E2496–E2497. [Google Scholar] [CrossRef] [PubMed]
- McNaughton, B.L.; O’Reilly, R.C. Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of. Psychol. Rev. 1995, 102, 419–457. [Google Scholar] [CrossRef]
- Graves, L.; Nagisetty, V.; Ganesh, V. Does AI remember? neural networks and the right to be forgotten. In Neural Networks and the Right to be Forgotten; 2020. [Google Scholar]
- Ding, M.; Ji, K.; Wang, D.; Xu, J. Understanding forgetting in continual learning with linear regression. arXiv 2024, arXiv:2405.17583. [Google Scholar] [CrossRef]
- Aljundi, R.; Lin, M.; Goujaud, B.; Bengio, Y. Gradient based sample selection for online continual learning. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Ritter, H.; Botev, A.; Barber, D. Online structured laplace approximations for overcoming catastrophic forgetting. Adv. Neural Inf. Process. Syst. 2018, 31. [Google Scholar]
- Schwarz, J.; Czarnecki, W.; Luketina, J.; Grabska-Barwinska, A.; Teh, Y.W.; Pascanu, R.; Hadsell, R. Progress & compress: A scalable framework for continual learning. In Proceedings of the International conference on machine learning. PMLR; 2018; pp. 4528–4537. [Google Scholar]
- Gou, J.; Yu, B.; Maybank, S.J.; Tao, D. Knowledge distillation: A survey. Int. J. Comput. Vis. 2021, 129, 1789–1819. [Google Scholar] [CrossRef]
- Dhar, P.; Singh, R.V.; Peng, K.C.; Wu, Z.; Chellappa, R. Learning without memorizing. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2019; pp. 5138–5146. [Google Scholar]
- Iscen, A.; Zhang, J.; Lazebnik, S.; Schmid, C. Memory-efficient incremental learning through feature adaptation. In Proceedings of the European conference on computer vision; 2020; Springer; pp. 699–715. [Google Scholar]
- Li, Z.; Hoiem, D. Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 2935–2947. [Google Scholar] [CrossRef]
- Castro, F.M.; Marín-Jiménez, M.J.; Guil, N.; Schmid, C.; Alahari, K. End-to-end incremental learning. In Proceedings of the Proceedings of the European conference on computer vision (ECCV); 2018; pp. 233–248. [Google Scholar]
- Douillard, A.; Cord, M.; Ollion, C.; Robert, T.; Valle, E. Podnet: Pooled outputs distillation for small-tasks incremental learning. In Proceedings of the Computer vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, proceedings, part XX 16. Springer, 2020, pp. 86–102.
- Hou, S.; Pan, X.; Loy, C.C.; Wang, Z.; Lin, D. Learning a unified classifier incrementally via rebalancing. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2019; pp. 831–839. [Google Scholar]
- Wu, C.; Herranz, L.; Liu, X.; Van De Weijer, J.; Raducanu, B.; et al. Memory replay gans: Learning to generate new categories without forgetting. Adv. Neural Inf. Process. Syst. 2018, 31. [Google Scholar]
- Zhai, M.; Chen, L.; Tung, F.; He, J.; Nawhal, M.; Mori, G. Lifelong gan: Continual learning for conditional image generation. In Proceedings of the Proceedings of the IEEE/CVF international conference on computer vision; 2019; pp. 2759–2768. [Google Scholar]
- Liu, X.; Masana, M.; Herranz, L.; Van de Weijer, J.; Lopez, A.M.; Bagdanov, A.D. Rotate your networks: Better weight consolidation and less catastrophic forgetting. In Proceedings of the 2018 24th international conference on pattern recognition (ICPR); IEEE, 2018; pp. 2262–2268. [Google Scholar]
- Benzing, F. Unifying importance based regularisation methods for continual learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics. PMLR; 2022; pp. 2372–2396. [Google Scholar]
- Lee, S.W.; Kim, J.H.; Jun, J.; Ha, J.W.; Zhang, B.T. Overcoming catastrophic forgetting by incremental moment matching. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Chaudhry, A.; Dokania, P.K.; Ajanthan, T.; Torr, P.H. Riemannian walk for incremental learning: Understanding forgetting and intransigence. In Proceedings of the Proceedings of the European conference on computer vision (ECCV); 2018; pp. 532–547. [Google Scholar]
- Aljundi, R.; Babiloni, F.; Elhoseiny, M.; Rohrbach, M.; Tuytelaars, T. Memory aware synapses: Learning what (not) to forget. In Proceedings of the Proceedings of the European conference on computer vision (ECCV); 2018; pp. 139–154. [Google Scholar]
- Nguyen, C.V.; Li, Y.; Bui, T.D.; Turner, R.E. Variational continual learning. arXiv 2017, arXiv:1710.10628. [Google Scholar]
- Chaudhry, A.; Rohrbach, M.; Elhoseiny, M.; Ajanthan, T.; Dokania, P.K.; Torr, P.H.; Ranzato, M. On tiny episodic memories in continual learning. arXiv 2019, arXiv:1902.10486. [Google Scholar] [CrossRef]
- Vitter, J.S. Random sampling with a reservoir. ACM Trans. Math. Softw. (TOMS) 1985, 11, 37–57. [Google Scholar] [CrossRef]
- Borsos, Z.; Mutny, M.; Krause, A. Coresets via bilevel optimization for continual learning and streaming. Adv. Neural Inf. Process. Syst. 2020, 33, 14879–14890. [Google Scholar]
- Yoon, J.; Madaan, D.; Yang, E.; Hwang, S.J. Online coreset selection for rehearsal-based continual learning. arXiv 2021, arXiv:2106.01085. [Google Scholar]
- Shim, D.; Mai, Z.; Jeong, J.; Sanner, S.; Kim, H.; Jang, J. Online class-incremental continual learning with adversarial shapley value. Proc. Proc. AAAI Conf. Artif. Intell. 2021, Vol. 35, 9630–9638. [Google Scholar]
- Bang, J.; Kim, H.; Yoo, Y.; Ha, J.W.; Choi, J. Rainbow memory: Continual learning with a memory of diverse samples. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021; pp. 8218–8227. [Google Scholar]
- Tiwari, R.; Killamsetty, K.; Iyer, R.; Shenoy, P. Gcr: Gradient coreset based replay buffer selection for continual learning. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022; pp. 99–108. [Google Scholar]
- Van Den Oord, A.; Vinyals, O.; et al. Neural discrete representation learning. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Wang, L.; Zhang, X.; Yang, K.; Yu, L.; Li, C.; Hong, L.; Zhang, S.; Li, Z.; Zhong, Y.; Zhu, J. Memory replay with data compression for continual learning. arXiv 2022, arXiv:2202.06592. [Google Scholar] [CrossRef]
- Kulesza, A.; Taskar, B.; et al. Determinantal point processes for machine learning. Found. Trends Mach. Learn. 2012, 5, 123–286. [Google Scholar]
- Kumari, L.; Wang, S.; Zhou, T.; Bilmes, J.A. Retrospective adversarial replay for continual learning. Adv. Neural Inf. Process. Syst. 2022, 35, 28530–28544. [Google Scholar]
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond empirical risk minimization. arXiv 2017, arXiv:1710.09412. [Google Scholar]
- Belouadah, E.; Popescu, A. Il2m: Class incremental learning with dual memory. In Proceedings of the Proceedings of the IEEE/CVF international conference on computer vision; 2019; pp. 583–592. [Google Scholar]
- Ebrahimi, S.; Petryk, S.; Gokul, A.; Gan, W.; Gonzalez, J.E.; Rohrbach, M.; Darrell, T. Remembering for the right reasons: Explanations reduce catastrophic forgetting. Appl. AI Lett. 2021, 2, e44. [Google Scholar] [CrossRef]
- Liu, Y.; Su, Y.; Liu, A.A.; Schiele, B.; Sun, Q. Mnemonics training: Multi-class incremental learning without forgetting. In Proceedings of the Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition; 2020; pp. 12245–12254. [Google Scholar]
- Jin, X.; Sadhu, A.; Du, J.; Ren, X. Gradient-based editing of memory examples for online task-free continual learning. Adv. Neural Inf. Process. Syst. 2021, 34, 29193–29205. [Google Scholar]
- Chaudhry, A.; Ranzato, M.; Rohrbach, M.; Elhoseiny, M. Efficient lifelong learning with a-gem. arXiv 2018, arXiv:1812.00420. [Google Scholar]
- Tang, S.; Chen, D.; Zhu, J.; Yu, S.; Ouyang, W. Layerwise optimization by gradient decomposition for continual learning. In Proceedings of the Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition; 2021; pp. 9634–9643. [Google Scholar]
- Sun, Q.; Lyu, F.; Shang, F.; Feng, W.; Wan, L. Exploring example influence in continual learning. Adv. Neural Inf. Process. Syst. 2022, 35, 27075–27086. [Google Scholar]
- Aljundi, R.; Belilovsky, E.; Tuytelaars, T.; Charlin, L.; Caccia, M.; Lin, M.; Page-Caccia, L. Online continual learning with maximal interfered retrieval. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Chaudhry, A.; Gordo, A.; Dokania, P.; Torr, P.; Lopez-Paz, D. Using hindsight to anchor past knowledge in continual learning. Proc. Proc. AAAI Conf. Artif. Intell. 2021, Vol. 35, 6993–7001. [Google Scholar] [CrossRef]
- Wu, Y.; Chen, Y.; Wang, L.; Ye, Y.; Liu, Z.; Guo, Y.; Fu, Y. Large scale incremental learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2019; pp. 374–382. [Google Scholar]
- Zhao, B.; Xiao, X.; Gan, G.; Zhang, B.; Xia, S.T. Maintaining discrimination and fairness in class incremental learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2020; pp. 13208–13217. [Google Scholar]
- Ahn, H.; Kwak, J.; Lim, S.; Bang, H.; Kim, H.; Moon, T. Ss-il: Separated softmax for incremental learning. In Proceedings of the Proceedings of the IEEE/CVF International conference on computer vision; 2021; pp. 844–853. [Google Scholar]
- Cha, H.; Lee, J.; Shin, J. Co2l: Contrastive continual learning. In Proceedings of the Proceedings of the IEEE/CVF International conference on computer vision; 2021; pp. 9516–9525. [Google Scholar]
- Simon, C.; Koniusz, P.; Harandi, M. On learning the geodesic path for incremental learning. In Proceedings of the Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition; 2021; pp. 1591–1600. [Google Scholar]
- Joseph, K.; Khan, S.; Khan, F.S.; Anwer, R.M.; Balasubramanian, V.N. Energy-based latent aligner for incremental learning. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022; pp. 7452–7461. [Google Scholar]
- Kurmi, V.K.; Patro, B.N.; Subramanian, V.K.; Namboodiri, V.P. Do not forget to attend to uncertainty while mitigating catastrophic forgetting. In Proceedings of the Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision; 2021; pp. 736–745. [Google Scholar]
- Ashok, A.; Joseph, K.; Balasubramanian, V.N. Class-incremental learning with cross-space clustering and controlled transfer. In Proceedings of the European conference on computer vision; 2022; Springer; pp. 105–122. [Google Scholar]
- Hu, X.; Tang, K.; Miao, C.; Hua, X.S.; Zhang, H. Distilling causal effect of data in class-incremental learning. In Proceedings of the Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition; 2021; pp. 3957–3966. [Google Scholar]
- Bhat, P.; Zonooz, B.; Arani, E. Task-aware information routing from common representation space in lifelong learning. arXiv 2023, arXiv:2302.11346. [Google Scholar] [CrossRef]
- Hou, S.; Pan, X.; Loy, C.C.; Wang, Z.; Lin, D. Lifelong learning via progressive distillation and retrospection. In Proceedings of the Proceedings of the European Conference on Computer Vision (ECCV); 2018; pp. 437–452. [Google Scholar]
- Wang, F.Y.; Zhou, D.W.; Ye, H.J.; Zhan, D.C. Foster: Feature boosting and compression for class-incremental learning. In Proceedings of the European conference on computer vision; 2022; Springer; pp. 398–414. [Google Scholar]
- Wang, L.; Zhang, M.; Jia, Z.; Li, Q.; Bao, C.; Ma, K.; Zhu, J.; Zhong, Y. Afec: Active forgetting of negative transfer in continual learning. Adv. Neural Inf. Process. Syst. 2021, 34, 22379–22391. [Google Scholar]
- Verwimp, E.; De Lange, M.; Tuytelaars, T. Rehearsal revealed: The limits and merits of revisiting samples in continual learning. In Proceedings of the Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021; pp. 9385–9394. [Google Scholar]
- Bonicelli, L.; Boschini, M.; Porrello, A.; Spampinato, C.; Calderara, S. On the effectiveness of lipschitz-driven rehearsal in continual learning. Adv. Neural Inf. Process. Syst. 2022, 35, 31886–31901. [Google Scholar]
- Yu, L.; Hu, T.; Hong, L.; Liu, Z.; Weller, A.; Liu, W. Continual learning by modeling intra-class variation. arXiv 2022, arXiv:2210.05398. [Google Scholar]
- Buzzega, P.; Boschini, M.; Porrello, A.; Abati, D.; Calderara, S. Dark experience for general continual learning: a strong, simple baseline. Adv. Neural Inf. Process. Syst. 2020, 33, 15920–15930. [Google Scholar]
- Boschini, M.; Bonicelli, L.; Buzzega, P.; Porrello, A.; Calderara, S. Class-incremental continual learning into the extended der-verse. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 5497–5512. [Google Scholar] [CrossRef]
- Prabhu, A.; Torr, P.H.; Dokania, P.K. Gdumb: A simple approach that questions our progress in continual learning. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16. Springer, 2020, pp. 524–540.
- Ayub, A.; Wagner, A.R. EEC: Learning to encode and regenerate images for continual learning. arXiv 2021, arXiv:2101.04904. [Google Scholar] [CrossRef]
- Ostapenko, O.; Puscas, M.; Klein, T.; Jahnichen, P.; Nabi, M. Learning to remember: A synaptic plasticity driven framework for continual learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2019; pp. 11321–11329. [Google Scholar]
- Kemker, R.; Kanan, C. Fearnet: Brain-inspired model for incremental learning. arXiv 2017, arXiv:1711.10563. [Google Scholar]
- Riemer, M.; Klinger, T.; Bouneffouf, D.; Franceschini, M. Scalable recollections for continual lifelong learning. Proc. Proc. AAAI Conf. Artif. Intell. 2019, Vol. 33, 1352–1359. [Google Scholar] [CrossRef]
- Rostami, M.; Kolouri, S.; Pilly, P.K. Complementary learning for overcoming catastrophic forgetting using experience replay. arXiv 2019, arXiv:1903.04566. [Google Scholar] [CrossRef]
- Pfülb, B.; Gepperth, A.; Bagus, B. Continual learning with fully probabilistic models. arXiv 2021, arXiv:2104.09240. [Google Scholar] [CrossRef]
- Gopalakrishnan, S.; Singh, P.R.; Fayek, H.; Ramasamy, S.; Ambikapathi, A. Knowledge capture and replay for continual learning. In Proceedings of the Proceedings of the IEEE/CVF winter conference on applications of computer vision; 2022; pp. 10–18. [Google Scholar]
- Ye, F.; Bors, A.G. Learning latent representations across multiple data domains using lifelong VAEGAN. In Proceedings of the European Conference on Computer Vision; 2020; Springer; pp. 777–795. [Google Scholar]
- Seff, A.; Beatson, A.; Suo, D.; Liu, H. Continual learning in generative adversarial nets. arXiv 2017, arXiv:1705.08395. [Google Scholar] [CrossRef]
- He, C.; Wang, R.; Shan, S.; Chen, X. Exemplar-supported generative reproduction for class incremental learning. Proc. BMVC 2018, Vol. 1, 2. [Google Scholar]
- Xiang, Y.; Fu, Y.; Ji, P.; Huang, H. Incremental learning using conditional adversarial networks. In Proceedings of the Proceedings of the IEEE/CVF International Conference on Computer Vision; 2019; pp. 6619–6628. [Google Scholar]
- Cong, Y.; Zhao, M.; Li, J.; Wang, S.; Carin, L. Gan memory with no forgetting. Adv. Neural Inf. Process. Syst. 2020, 33, 16481–16494. [Google Scholar]
- Liu, X.; Wu, C.; Menta, M.; Herranz, L.; Raducanu, B.; Bagdanov, A.D.; Jui, S.; de Weijer, J.v. Generative feature replay for class-incremental learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops; 2020; pp. 226–227. [Google Scholar]
- Ostapenko, O.; Lesort, T.; Rodriguez, P.; Arefin, M.R.; Douillard, A.; Rish, I.; Charlin, L. Continual learning with foundation models: An empirical study of latent replay. In Proceedings of the Conference on lifelong learning agents. PMLR; 2022; pp. 60–91. [Google Scholar]
- Wang, Z.; Liu, L.; Duan, Y.; Tao, D. Continual learning through retrieval and imagination. Proc. Proc. AAAI Conf. Artif. Intell. 2022, Vol. 36, 8594–8602. [Google Scholar] [CrossRef]
- Wang, Z.; Liu, L.; Kong, Y.; Guo, J.; Tao, D. Online continual learning with contrastive vision transformer. In Proceedings of the European Conference on Computer Vision; 2022; Springer; pp. 631–650. [Google Scholar]
- Wang, Y.; Huang, Z.; Hong, X. S-prompts learning with pre-trained transformers: An occam’s razor for domain incremental learning. Adv. Neural Inf. Process. Syst. 2022, 35, 5682–5695. [Google Scholar]
- Wang, Z.; Zhang, Z.; Ebrahimi, S.; Sun, R.; Zhang, H.; Lee, C.Y.; Ren, X.; Su, G.; Perot, V.; Dy, J.; et al. Dualprompt: Complementary prompting for rehearsal-free continual learning. In Proceedings of the European conference on computer vision; 2022; Springer; pp. 631–648. [Google Scholar]
- Park, C.W.; Seo, S.W.; Kang, N.; Ko, B.; Choi, B.W.; Park, C.M.; Chang, D.K.; Kim, H.; Kim, H.; Lee, H.; et al. Artificial intelligence in health care: current applications and issues. J. Korean Med. Sci. 2020, 35. [Google Scholar] [CrossRef] [PubMed]
- Zhu, D.; Bu, Q.; Zhu, Z.; Zhang, Y.; Wang, Z. Advancing autonomy through lifelong learning: a survey of autonomous intelligent systems. Front. Neurorobotics 2024, 18, 1385778. [Google Scholar] [CrossRef]
- Ciupek, D.; Malawski, M.; Pieciak, T. Federated Learning: A new frontier in the exploration of multi-institutional medical imaging data. arXiv 2025, arXiv:2503.20107. [Google Scholar]
- Thakur, G.K.; Thakur, A.; Kulkarni, S.; Khan, N.; Khan, S. Deep learning approaches for medical image analysis and diagnosis. Cureus 2024, 16. [Google Scholar] [CrossRef]
- Jeon, J.; Kim, J.; Kim, J.; Kim, K.; Mohaisen, A.; Kim, J.K. Privacy-preserving deep learning computation for geo-distributed medical big-data platforms. In Proceedings of the 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks–Supplemental Volume (DSN-S); IEEE, 2019; pp. 3–4. [Google Scholar]
- Pianykh, O.S.; Langs, G.; Dewey, M.; Enzmann, D.R.; Herold, C.J.; Schoenberg, S.O.; Brink, J.A. Continuous learning AI in radiology: implementation principles and early applications. Radiology 2020, 297, 6–14. [Google Scholar] [CrossRef] [PubMed]
- Pinto-Coelho, L. How artificial intelligence is shaping medical imaging technology: a survey of innovations and applications. Bioengineering 2023, 10, 1435. [Google Scholar] [CrossRef]
- Zhu, Z.; Sun, Y.; Honarvar Shakibaei; Asli, B. Early Breast Cancer Detection Using Artificial Intelligence Techniques Based on Advanced Image Processing Tools. Electronics 2024, 13, 3575. [Google Scholar] [CrossRef]
- da Silva Motta, D.; Badaró, R.; Santos, A.; Kirchner, F. Use of artificial intelligence on the control of vector-borne diseases; IntechOpen, 2018. [Google Scholar]
- Dasari, S.; Ebert, F.; Tian, S.; Nair, S.; Bucher, B.; Schmeckpeper, K.; Singh, S.; Levine, S.; Finn, C. Robonet: Large-scale multi-robot learning. arXiv 2019, arXiv:1910.11215. [Google Scholar]
- Haque, N. Catastrophic Forgetting in LLMs: A Comparative Analysis Across Language Tasks. arXiv 2025, arXiv:2504.01241. [Google Scholar] [CrossRef]
- Yao, Y.; González-Vélez, H. AI-Powered System to Facilitate Personalized Adaptive Learning in Digital Transformation. Appl. Sci. 2025, 15, 4989. [Google Scholar] [CrossRef]
- Li, D.; Chen, Z.; Cho, E.; Hao, J.; Liu, X.; Xing, F.; Guo, C.; Liu, Y. Overcoming catastrophic forgetting during domain adaptation of seq2seq language generation. In Proceedings of the Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; 2022; pp. 5441–5454. [Google Scholar]
- Liu, T.; Ungar, L.; Sedoc, J. Continual learning for sentence representations using conceptors. arXiv 2019, arXiv:1904.09187. [Google Scholar] [CrossRef]
- Monaikul, N.; Castellucci, G.; Filice, S.; Rokhlenko, O. Continual learning for named entity recognition. Proc. Proc. AAAI Conf. Artif. Intell. 2021, Vol. 35, 13570–13577. [Google Scholar] [CrossRef]
- Li, G.; Zhai, Y.; Chen, Q.; Gao, X.; Zhang, J.; Zhang, Y. Continual few-shot intent detection. In Proceedings of the Proceedings of the 29th international conference on computational linguistics; 2022; pp. 333–343. [Google Scholar]
- Liu, Q.; Yu, X.; He, S.; Liu, K.; Zhao, J. Lifelong intent detection via multi-strategy rebalancing. arXiv 2021, arXiv:2108.04445. [Google Scholar] [CrossRef]
- Varshney, V.; Patidar, M.; Kumar, R.; Shroff, G.; Vig, L. Prompt augmented generative replay via supervised contrastive training for lifelong intent detection, 2024. US Patent App. 18/215,972.
- Qin, C.; Joty, S. Lfpt5: A unified framework for lifelong few-shot language learning based on prompt tuning of t5. arXiv 2021, arXiv:2110.07298. [Google Scholar]
- Sun, J.; Wang, S.; Zhang, J.; Zong, C. Distill and replay for continual language learning. In Proceedings of the Proceedings of the 28th international conference on computational linguistics; 2020; pp. 3569–3579. [Google Scholar]
- Cao, Y.; Wei, H.R.; Chen, B.; Wan, X. Continual learning for neural machine translation. In Proceedings of the Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; 2021; pp. 3964–3974. [Google Scholar]
- Shao, C.; Feng, Y. Overcoming catastrophic forgetting beyond continual learning: Balanced training for neural machine translation. arXiv 2022, arXiv:2203.03910. [Google Scholar] [CrossRef]
- Qin, Y.; Zhang, J.; Lin, Y.; Liu, Z.; Li, P.; Sun, M.; Zhou, J. Elle: Efficient lifelong pre-training for emerging data. arXiv 2022, arXiv:2203.06311. [Google Scholar] [CrossRef]
- Huang, Y.; Zhang, Y.; Chen, J.; Wang, X.; Yang, D. Continual learning for text classification with information disentanglement based regularization. arXiv 2021, arXiv:2104.05489. [Google Scholar] [CrossRef]
- de Masson D’Autume, C.; Ruder, S.; Kong, L.; Yogatama, D. Episodic memory in lifelong language learning. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Wang, Z.; Mehta, S.V.; Póczos, B.; Carbonell, J. Efficient meta lifelong-learning with limited memory. arXiv 2020, arXiv:2010.02500. [Google Scholar] [CrossRef]
- Xu, K.; Verma, S.; Finn, C.; Levine, S. Continual learning of control primitives: Skill discovery via reset-games. Adv. Neural Inf. Process. Syst. 2020, 33, 4999–5010. [Google Scholar]
- Mi, F.; Chen, L.; Zhao, M.; Huang, M.; Faltings, B. Continual learning for natural language generation in task-oriented dialog systems. arXiv 2020, arXiv:2010.00910. [Google Scholar] [CrossRef]
- Li, Z.; Qu, L.; Haffari, G. Total recall: a customized continual learning method for neural semantic parsers. arXiv 2021, arXiv:2109.05186. [Google Scholar] [CrossRef]
- Sun, F.K.; Ho, C.H.; Lee, H.Y. Lamol: Language modeling for lifelong language learning. arXiv 2019, arXiv:1909.03329. [Google Scholar] [CrossRef]
- Zhang, Y.; Wang, X.; Yang, D. Continual sequence generation with adaptive compositional modules. arXiv 2022, arXiv:2203.10652. [Google Scholar] [CrossRef]
- Wang, R.; Yu, T.; Zhao, H.; Kim, S.; Mitra, S.; Zhang, R.; Henao, R. Few-shot class-incremental learning for named entity recognition. Proceedings of the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 2022, Volume 1, 571–582. [Google Scholar]
- Geng, B.; Yuan, F.; Xu, Q.; Shen, Y.; Xu, R.; Yang, M. Continual learning for task-oriented dialogue system with iterative network pruning, expanding and masking. arXiv 2021, arXiv:2107.08173. [Google Scholar] [CrossRef]
- Shen, Y.; Zeng, X.; Jin, H. A progressive model to enable continual learning for semantic slot filling. In Proceedings of the Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP); 2019; pp. 1279–1284. [Google Scholar]
- Wang, C.; Pan, H.; Liu, Y.; Chen, K.; Qiu, M.; Zhou, W.; Huang, J.; Chen, H.; Lin, W.; Cai, D. Mell: Large-scale extensible user intent classification for dialogue systems with meta lifelong learning. In Proceedings of the Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining; 2021; pp. 3649–3659. [Google Scholar]
- Wu, T.; Li, X.; Li, Y.F.; Haffari, G.; Qi, G.; Zhu, Y.; Xu, G. Curriculum-meta learning for order-robust continual relation extraction. Proc. Proc. AAAI Conf. Artif. Intell. 2021, Vol. 35, 10363–10369. [Google Scholar] [CrossRef]
- Madotto, A.; Lin, Z.; Zhou, Z.; Moon, S.; Crook, P.; Liu, B.; Yu, Z.; Cho, E.; Wang, Z. Continual learning in task-oriented dialogue systems. arXiv 2020, arXiv:2012.15504. [Google Scholar] [CrossRef]
- Ermis, B.; Zappella, G.; Wistuba, M.; Rawal, A.; Archambeau, C. Memory efficient continual learning with transformers. Adv. Neural Inf. Process. Syst. 2022, 35, 10629–10642. [Google Scholar]
- Zhu, Q.; Li, B.; Mi, F.; Zhu, X.; Huang, M. Continual prompt tuning for dialog state tracking. arXiv 2022, arXiv:2203.06654. [Google Scholar] [CrossRef]
- Liu, M.; Chang, S.; Huang, L. Incremental prompting: Episodic memory prompt for lifelong event detection. arXiv 2022, arXiv:2204.07275. [Google Scholar] [CrossRef]
- Yin, W.; Li, J.; Xiong, C. Contintin: Continual learning from task instructions. arXiv 2022, arXiv:2203.08512. [Google Scholar]
- Xia, C.; Yin, W.; Feng, Y.; Yu, P. Incremental few-shot text classification with multi-round new classes: Formulation, dataset and system. arXiv 2021, arXiv:2104.11882. [Google Scholar]
- Wang, L.; Xie, J.; Zhang, X.; Huang, M.; Su, H.; Zhu, J. Hierarchical decomposition of prompt-based continual learning: Rethinking obscured sub-optimality. Adv. Neural Inf. Process. Syst. 2023, 36, 69054–69076. [Google Scholar]
- Wang, Z.; Zhang, Z.; Lee, C.Y.; Zhang, H.; Sun, R.; Ren, X.; Su, G.; Perot, V.; Dy, J.; Pfister, T. Learning to prompt for continual learning. In Proceedings of the Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022; pp. 139–149. [Google Scholar]
- Geishauser, C.; van Niekerk, C.; Lubis, N.; Heck, M.; Lin, H.C.; Feng, S.; Gašić, M. Dynamic dialogue policy for continual reinforcement learning. arXiv 2022, arXiv:2204.05928. [Google Scholar] [CrossRef]
- Wang, W.; Zhang, J.; Li, Q.; Hwang, M.Y.; Zong, C.; Li, Z. Incremental learning from scratch for task-oriented dialogue systems. arXiv 2019, arXiv:1906.04991. [Google Scholar] [CrossRef]
- Pasunuru, R.; Stoyanov, V.; Bansal, M. Continual few-shot learning for text classification. In Proceedings of the Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing; 2021; pp. 5688–5702. [Google Scholar]
- Qin, C.; Joty, S. Continual few-shot relation learning via embedding space regularization and data augmentation. arXiv 2022, arXiv:2203.02135. [Google Scholar] [CrossRef]
- Ren, H.; Cai, Y.; Chen, X.; Wang, G.; Li, Q.; et al. A two-phase prototypical network model for incremental few-shot relation classification; Association for Computational Linguistics (ACL), 2020. [Google Scholar]
- Garcia, X.; Constant, N.; Parikh, A.P.; Firat, O. Towards continual learning for multilingual machine translation via vocabulary substitution. arXiv 2021, arXiv:2103.06799. [Google Scholar] [CrossRef]
- Gu, S.; Feng, Y. Investigating catastrophic forgetting during continual training for neural machine translation. arXiv 2020, arXiv:2011.00678. [Google Scholar] [CrossRef]
- Yan, S.; Hong, L.; Xu, H.; Han, J.; Tuytelaars, T.; Li, Z.; He, X. Generative negative text replay for continual vision-language pretraining. In Proceedings of the European Conference on Computer Vision; 2022; Springer; pp. 22–38. [Google Scholar]
- Greco, C.; Plank, B.; Fernández, R.; Bernardi, R. Psycholinguistics meets continual learning: Measuring catastrophic forgetting in visual question answering. arXiv 2019, arXiv:1906.04229. [Google Scholar] [CrossRef]
- Srinivasan, T.; Chang, T.Y.; Pinto Alva, L.; Chochlakis, G.; Rostami, M.; Thomason, J. CLiMB: A continual learning benchmark for vision-and-language tasks. Adv. Neural Inf. Process. Syst. 2022, 35, 29440–29453. [Google Scholar]
- Martínez-Plumed, F.; Ferri, C.; Hernández-Orallo, J.; Ramírez-Quintana, M.J. Forgetting and consolidation for incremental and cumulative knowledge acquisition systems. arXiv 2015, arXiv:1502.05615. [Google Scholar] [CrossRef]
- Christakopoulou, K.; Lalama, A.; Adams, C.; Qu, I.; Amir, Y.; Chucri, S.; Vollucci, P.; Soldo, F.; Bseiso, D.; Scodel, S.; et al. Large language models for user interest journeys. arXiv 2023, arXiv:2305.15498. [Google Scholar] [CrossRef]
- Wang, X.J.; Lee, C.P.; Mutlu, B. LearnMate: Enhancing Online Education with LLM-Powered Personalized Learning Plans and Support. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems; 2025; pp. 1–10. [Google Scholar]
- Sabeima, M.; Lamolle, M.; Nanne, M.F. Towards personalized adaptive learning in e-learning recommender systems. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 14–20. [Google Scholar] [CrossRef]
- Joy, J.; Raj, N.S.; VG, R. Ontology-based E-learning content recommender system for addressing the pure cold-start problem. ACM J. Data Inf. Qual. 2021, 13, 1–27. [Google Scholar] [CrossRef]
- Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold networks. arXiv 2024, arXiv:2404.19756. [Google Scholar]
- Bountouni, N.; Koussouris, S.; Vasileiou, A.; Kazazis, S.A. A Holistic Framework for Safeguarding of SMEs: A Case Study. In Proceedings of the 2023 19th International Conference on the Design of Reliable Communication Networks (DRCN); IEEE, 2023; pp. 1–5. [Google Scholar]
- Asmar, M.; Tuqan, A. Integrating machine learning for sustaining cybersecurity in digital banks. Heliyon 2024, 10. [Google Scholar] [CrossRef]
- Ahmed, U.; Nazir, M.; Sarwar, A.; Ali, T.; Aggoune, E.H.M.; Shahzad, T.; Khan, M.A. Signature-based intrusion detection using machine learning and deep learning approaches empowered with fuzzy clustering. Sci. Rep. 2025, 15, 1726. [Google Scholar]
- Dohare, S.; Hernandez-Garcia, J.F.; Lan, Q.; Rahman, P.; Mahmood, A.R.; Sutton, R.S. Loss of plasticity in deep continual learning. Nature 2024, 632, 768–774. [Google Scholar] [CrossRef]
- Mohammed, K. Harnessing the Speed and Accuracy of Machine Learning to Advance Cybersecurity. arXiv 2023, arXiv:2302.12415. [Google Scholar]
- Rahul-Vigneswaran, K.; Poornachandran, P.; Soman, K. A compendium on network and host based intrusion detection systems. In Proceedings of the ICDSMLA 2019: 1st International Conference on Data Science, Machine Learning and Applications; 2020; Springer; pp. 23–30. [Google Scholar]
- Stokes, J.W.; Wang, D.; Marinescu, M.; Marino, M.; Bussone, B. Attack and defense of dynamic analysis-based, adversarial neural malware classification models. arXiv 2017, arXiv:1712.05919. [Google Scholar] [CrossRef]
- Sameen, M.; Han, K.; Hwang, S.O. PhishHaven—An efficient real-time AI phishing URLs detection system. IEEE Access 2020, 8, 83425–83443. [Google Scholar] [CrossRef]


| Survey | Year | Main Focus | CL Coverage | Modern Trends | Main Limitation |
|---|---|---|---|---|---|
| Wang et al. [13] | 2024 | General CL theory and methods | TIL, DIL, and CIL | Limited discussion of prompting and PEFT | Minimal focus on foundation-model adaptation and modern multimodal CL |
| Van de Ven et al. [21] | 2022 | Taxonomy of CL scenarios | TIL, DIL, and CIL | Does not cover recent CL trends | Primarily focused on conceptual categorization of CL settings |
| Bidaki et al. [22] | 2025 | Online CL | Streaming and online CL | Benchmark-oriented discussion | Narrow scope centered on online learning settings |
| Zhou et al. [23] | 2024 | Class-incremental learning | Mainly CIL | Limited multimodal and foundation-model discussion | Restricted primarily to CIL strategies and benchmarks |
| Wickramasinghe et al. [24] | 2023 | Overview of CL methods | General CL settings | Covers traditional CL methods | Limited synthesis of transformer- and prompt-based CL methods |
| This review | 2026 | Modern CL trends, evaluation gaps, and deployment challenges | TIL, DIL, CIL, online, multimodal, and federated CL | Prompt learning, PEFT, foundation models, and diffusion models | Discusses evaluation inconsistency, benchmark fragmentation, deployment challenges, and emerging large-scale CL directions |
| Feature | CL | Transfer Learning | MTL | Online Learning |
|---|---|---|---|---|
| Task Availability | Sequential | One-time transfer | Simultaneous | Single task |
| Focus | Learning without forgetting | Knowledge transfer | Shared representation | Incremental updates |
| Addresses Forgetting | Yes | No | No | No |
| Data Distribution | Non-stationary | Varies | Varies | Stationary |
| Scenario | Task Label Known | Output Space | Data Distribution | Example Application |
|---|---|---|---|---|
| Task-Incremental | Yes | Varies | Changes | Multi-task NLP, robotics |
| Domain-Incremental | No | Same | Changes | Handwriting recognition, IoT sensors |
| Class-Incremental | No | Expands | Changes | Image classification, object detection |
| Instance-Incremental | N/A | Same | Same (new data) | Spam filtering, online analytics |
| Unsupervised/Other | N/A | N/A | Changes | Clustering, RL in dynamic settings |
| Aspect | Details |
|---|---|
| Definition | Models learn a sequence of distinct tasks, with task identity provided during both training and inference. |
| Core Challenge | Maintaining task-specific performance without interference between tasks (catastrophic forgetting). |
| Inference Requirement | Task identity is known, allowing the model to use task-specific components (e.g., separate output heads). |
| Key Techniques | - Task-specific output heads. - Parameter isolation (dedicated parameters for each task). - Regularization to preserve important parameters. |
| Advantages | - Robust retention of task-specific knowledge. - Simplified learning due to known task boundaries and identities. |
| Challenges | - Scalability issues with a growing number of tasks. - Limited knowledge transfer between tasks. |
| Example Applications | - Sequential learning of different object categories (e.g., animals, vehicles). - Robotics: learning distinct tasks like grasping and navigation. - Diagnostic systems for different modalities (e.g., X-rays, MRIs). |
| Evaluation Metrics | - Task-specific accuracy. - Memory and computational efficiency for handling multiple tasks. |
| Future Directions | - Modular architectures to balance task isolation and scalability. - Approaches to enable knowledge transfer across tasks. |
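
The inference requirement in the table above (task identity is known, so task-specific output heads can be used) can be illustrated with a minimal sketch. The function `predict`, the mapping `TASK_TO_CLASSES`, and the logit values are hypothetical and chosen only for illustration; a real system would obtain the logits from a trained network.

```python
from typing import Dict, List, Optional, Sequence


def predict(logits: Sequence[float],
            task_classes: Optional[Sequence[int]] = None) -> int:
    """Return the predicted class index.

    If task_classes is given (task identity known), only those class
    indices compete. Otherwise the argmax runs over every class.
    """
    candidates = range(len(logits)) if task_classes is None else task_classes
    return max(candidates, key=lambda c: logits[c])


# Hypothetical split: task 0 owns classes 0-1, task 1 owns classes 2-3.
TASK_TO_CLASSES: Dict[int, List[int]] = {0: [0, 1], 1: [2, 3]}

# Made-up logits for one input whose true class is 1 (it belongs to task 0).
logits = [0.2, 1.5, 2.1, -0.3]

with_task_id = predict(logits, TASK_TO_CLASSES[0])  # only task 0's classes compete
without_task_id = predict(logits)                   # all classes compete

print(with_task_id, without_task_id)
```

With the task identity supplied, the spuriously high logit of class 2 is ignored; without it, the same logits yield a wrong prediction. This is one way to see why settings that withhold task identity at inference are generally harder than the task-incremental setting.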
| Aspect | Details |
|---|---|
| Definition | Models learn to adapt to new data distributions (domains) over time while maintaining the same task objective. |
| Core Challenge | Adapting to new domains without forgetting knowledge of previously learned domains (catastrophic forgetting). |
| Inference Requirement | Task identity is unknown; the model must generalize across domains without explicit domain information. |
| Key Techniques | - Domain adaptation methods (e.g., feature alignment). - Regularization techniques to retain domain-invariant features. - Memory replay or dynamic models to balance old and new knowledge. |
| Advantages | - Allows systems to handle non-stationary data distributions. - Maintains consistent task performance across multiple domains. |
| Challenges | - Catastrophic forgetting when adapting to new domains. - Handling domain-specific biases while ensuring generalization. - Computational and memory constraints as new domains increase. |
| Example Applications | - Object recognition in different environmental conditions (e.g., sunny, foggy, rainy). - Medical imaging systems adapting to scans from different hospitals or devices. - NLP tasks such as sentiment analysis across different domains (e.g., movie reviews, product reviews). |
| Evaluation Metrics | - Performance consistency across domains. - Forgetting rate for previously learned domains. - Domain generalization ability on unseen domains. |
| Future Directions | - Efficient methods for domain adaptation without overfitting to new domains. - Scalable approaches to handle increasing numbers of domains. - Techniques to balance domain-specific and domain-invariant learning. |
| Aspect | Details |
|---|---|
| Definition | Models learn new classes sequentially, and the task identity is not provided during inference. |
| Core Challenge | Catastrophic forgetting: learning new classes overwrites knowledge of previously learned classes. |
| Inference Requirement | Model must classify inputs across all learned classes without knowledge of task identity. |
| Key Techniques | - Memory replay (storing/replaying previous class examples). - KD (preserving learned representations). - Dynamic architecture (expanding capacity for new classes). |
| Advantages | - Enables incremental learning without full retraining. - Efficient handling of scenarios where new class data is available over time. |
| Challenges | - Handling class imbalance, as new classes often have fewer examples. - Managing memory and computational costs as the number of classes increases. |
| Example Applications | - Extending image classifiers with new object categories. - Autonomous vehicles learning new traffic signs and objects. - Healthcare models adapting to diagnose new diseases. |
| Evaluation Metrics | - Accuracy across all classes (old and new). - Forgetting rate (performance drop on previously learned classes). |
| Future Directions | - Scalable memory-efficient replay methods. - Adaptive architectures that balance stability and plasticity. - Improved algorithms for mitigating class imbalance and preserving older class knowledge. |
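
As an illustration of the KD entry in the table above, the sketch below shows one common form of a distillation term: the updated model's outputs on the old classes are pulled toward the frozen previous model's softened outputs. The names `softmax` and `distillation_loss`, the temperature value, and the use of a KL divergence are assumptions made for this example, not a specific published method.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits: Sequence[float],
                      new_logits: Sequence[float],
                      temperature: float = 2.0) -> float:
    """KL(old || new) over the old classes only.

    old_logits: outputs of the frozen previous model (old classes).
    new_logits: outputs of the updated model (old classes first, then new ones).
    """
    n_old = len(old_logits)
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[:n_old], temperature)
    return sum(p * math.log(p / q) for p, q in zip(p_old, p_new))


# The updated model agrees with the old one on old classes: no penalty.
unchanged = distillation_loss([1.0, 2.0], [1.0, 2.0, 5.0])
# The updated model has reversed its preference among old classes: positive penalty.
drifted = distillation_loss([1.0, 2.0], [2.0, 1.0, 0.0])
```

In training, a term of this kind would typically be added to the classification loss on the new classes, with a weighting factor that trades stability against plasticity.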
| Aspect | Details |
|---|---|
| Definition | Models learn incrementally from a stream of data instances, which may belong to existing or new classes, without explicit task boundaries. |
| Core Challenge | Adapting to new data while retaining knowledge of previously learned data, especially without clear transitions or task identities. |
| Inference Requirement | The model must classify instances across all learned classes without explicit knowledge of when new data or classes were introduced. |
| Key Techniques | - Memory replay (storing or generating past data). - Regularization techniques to preserve critical parameters. - Dynamic architectures for flexible capacity adjustment. |
| Advantages | - Handles continuously evolving data streams. - Allows for learning without task-specific information or retraining. |
| Challenges | - Managing catastrophic forgetting as new data arrives. - Handling class imbalance and unstructured data streams. - Resource efficiency for memory and computational costs. |
| Example Applications | - Object recognition systems that adapt to new categories dynamically. - Recommendation systems updating preferences with new user data and items. - Continuous monitoring systems in healthcare, incorporating evolving signals from wearable devices. |
| Evaluation Metrics | - Accuracy across all classes (old and new). - Forgetting rate (performance drop on previously learned data). - Adaptation speed to new data. |
| Future Directions | - Hybrid methods combining memory replay with adaptive architectures. - Scalable solutions for handling large and imbalanced data streams. - Techniques for efficient data prioritization and representation learning. |
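
For the memory-replay entry in the table above, a fixed-size buffer filled by reservoir sampling is one option that needs no task boundaries, because every instance seen so far is retained with equal probability. The class below is a minimal sketch under that assumption; the name `ReservoirBuffer` and its methods are hypothetical.

```python
import random
from typing import Any, List


class ReservoirBuffer:
    """Fixed-capacity replay memory filled by reservoir sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.items: List[Any] = []
        self.n_seen = 0  # number of stream instances observed so far
        self._rng = random.Random(seed)

    def add(self, item: Any) -> None:
        """Observe one stream instance and possibly store it."""
        self.n_seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / n_seen,
            # overwriting a uniformly chosen slot.
            j = self._rng.randrange(self.n_seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k: int) -> List[Any]:
        """Draw up to k stored items (without replacement) for rehearsal."""
        return self._rng.sample(self.items, min(k, len(self.items)))


buffer = ReservoirBuffer(capacity=5)
for instance in range(100):  # stand-in for an unbounded data stream
    buffer.add(instance)

replay_batch = buffer.sample(3)  # would be mixed into the next training batch
```

The memory cost stays constant regardless of stream length, which addresses the resource-efficiency challenge listed above, although class balance inside the buffer is not guaranteed.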
| Paradigm | Description | Key Challenges | Key Techniques | Example Applications |
|---|---|---|---|---|
| Few-Shot CL | Models learn new tasks or classes with minimal labeled data while retaining prior knowledge. | - Adapting with limited data. - Avoiding catastrophic forgetting. | - Meta-learning. - Episodic memory. - Generative replay. | - Rare disease diagnosis. - Few-shot object recognition. |
| Unsupervised CL | Models learn from data streams without explicit labels by discovering patterns or structures. | - Extracting meaningful features from unlabeled data. - Balancing old and new pattern representations. | - Self-supervised learning. - Contrastive learning. - Clustering methods. | - Video surveillance anomaly detection. - Social media trend analysis. |
| Meta-Continual Learning | Combines meta-learning with CL to enable rapid adaptation to new tasks. | - Balancing fast adaptation with knowledge retention. - Stability-plasticity tradeoff. | - Gradient-based meta-learning. - Memory-augmented neural networks. | - Personalized AI assistants. - Adaptive recommendation systems. |
| Federated CL | Models learn incrementally across distributed nodes while preserving privacy. | - Handling heterogeneous data distributions across nodes. - Avoiding forgetting across distributed devices. - Privacy concerns. | - Decentralized learning algorithms. - Secure aggregation protocols. - Adaptive synchronization methods. | - Personalized healthcare monitoring. - Mobile device personalization. |
| Multi-Agent CL | Multiple agents learn and adapt in a shared environment while interacting and collaborating. | - Coordinating knowledge transfer between agents. - Managing inter-agent dependencies and scalability. | - Communication protocols. - Shared memory systems. - Ensemble learning. | - Collaborative robotics. - Distributed sensor networks. |
| Concept | Core Idea | Main Challenge | Representative Strategies |
|---|---|---|---|
| Stability-Plasticity Dilemma | Balancing retention of prior knowledge with adaptation to new information | Excessive stability limits adaptation, while excessive plasticity causes forgetting | Regularization, replay mechanisms, adaptive architectures |
| Catastrophic Forgetting | Learning new tasks degrades performance on earlier tasks | Parameter interference and overlapping representations | Replay methods, parameter isolation, knowledge distillation, regularization |
| Forward and Backward Transfer | Leveraging previous knowledge to improve future learning and vice versa | Avoiding negative transfer across tasks | Shared representations, multi-task learning, transferable feature learning |
| Representation Learning | Learning reusable and task-invariant feature representations | Separating task-specific and generalizable features | Self-supervised learning, contrastive learning, feature disentanglement |
| Neuroscientific Inspiration | Drawing inspiration from biological memory and adaptation mechanisms | Translating biological principles into scalable AI systems | Synaptic consolidation, rehearsal mechanisms, dynamic expansion |
| Practical and Ethical Considerations | Ensuring reliable and responsible continual adaptation | Resource constraints, fairness, privacy, and safety | Lightweight models, federated learning, fairness-aware training |
| Aspect | Description |
|---|---|
| Definition | The significant loss of performance on previously learned tasks when a neural network learns new tasks. |
| Cause | Overwriting of neural network parameters due to global updates during training on new tasks. |
| Key Mechanism | Parameter Drift: Critical parameters for previous tasks are modified to optimize new task learning. |
| Factors Exacerbating Forgetting | - Overlapping representations shared by different tasks. - Sequential data access without revisiting earlier tasks. - Lack of task awareness during inference in class/domain-incremental settings. |
| Examples | - A model trained to classify animals forgetting how to classify vehicles after learning new classes. - An object detection model in autonomous driving failing to recognize stop signs after adapting to new road signs. |
| Mitigation Strategy | Description | Examples |
|---|---|---|
| Regularization Methods | Introduce constraints during training to prevent significant updates to parameters crucial for earlier tasks. | - EWC: Penalizes parameter changes. - Synaptic Intelligence: Tracks parameter importance. |
| Replay-Based Methods | Retain and replay data from previous tasks during training on new tasks. | - Experience Replay: Stores a subset of prior task data. - Generative Replay: Generates synthetic data from past tasks. |
| Dynamic Architectures | Expand or adapt the network architecture to allocate new resources for each task. | - Progressive Neural Networks: Adds new parameters per task. - Dynamically expandable networks. |
| Representation Learning | Learn generalizable features that can be reused across tasks, reducing task-specific interference. | - Self-supervised pretraining. - Disentangled representations. |
| Hybrid Approaches | Combine multiple strategies, such as regularization with replay or dynamic architectures. | - Replay with EWC to balance plasticity and stability. |
| Evaluation Metrics | Description | |
| Forgetting Rate | Measures the drop in performance on previously learned tasks after learning new ones. | |
| Accuracy | Assesses performance across all tasks (old and new). | |
| Knowledge Transfer | Evaluates how well the model uses previous knowledge to improve learning on new tasks. | |
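
The evaluation metrics listed above are often computed from an accuracy matrix whose entry `acc[i][j]` is the accuracy on task `j` after training on tasks `0..i`. The sketch below follows widely used definitions of average accuracy, average forgetting, and backward transfer; the function names and the example matrix are made up for illustration, and forward transfer is omitted because it additionally needs a baseline accuracy for each task before it is trained.

```python
from typing import List

Matrix = List[List[float]]  # acc[i][j]: accuracy on task j after training task i


def average_accuracy(acc: Matrix) -> float:
    """Mean accuracy over all tasks after the final training stage."""
    final = acc[-1]
    return sum(final) / len(final)


def average_forgetting(acc: Matrix) -> float:
    """Mean drop from each earlier task's best past accuracy to its final one.

    Requires at least two tasks.
    """
    n = len(acc)
    drops = [max(acc[i][j] for i in range(j, n - 1)) - acc[-1][j]
             for j in range(n - 1)]
    return sum(drops) / len(drops)


def backward_transfer(acc: Matrix) -> float:
    """Mean change on earlier tasks relative to right after they were learned.

    Negative values indicate forgetting. Requires at least two tasks.
    """
    n = len(acc)
    return sum(acc[-1][j] - acc[j][j] for j in range(n - 1)) / (n - 1)


# Made-up results for three sequential tasks (unseen tasks scored as 0.0).
acc = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.80, 0.88],
]

print(average_accuracy(acc), average_forgetting(acc), backward_transfer(acc))
```

Reporting all three values together helps separate a method that forgets little from one that simply learns little on new tasks.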
| Method Category | Key Characteristics | Main Challenges | Memory Cost | Typical Applications |
|---|---|---|---|---|
| Regularization-Based Methods | Constrain parameter updates to preserve previous knowledge; memory efficient and easy to integrate | Limited performance under severe domain shifts and long task sequences | Low | Task-incremental learning, resource-constrained systems, privacy-sensitive applications |
| Replay-Based Methods | Replay stored or generated samples to reinforce previous knowledge; strong retention performance | Replay buffer management, privacy concerns, and storage overhead | Moderate-High | Class-incremental learning, reinforcement learning, streaming adaptation |
| Architecture-Based Methods | Allocate task-specific modules or expandable subnetworks to reduce interference | Poor scalability due to parameter growth and increasing model complexity | High | Task-incremental learning with explicit task boundaries |
| Optimization-Based Methods | Modify gradient updates to balance stability and plasticity during training | High computational complexity and optimization overhead | Moderate | Gradient-constrained continual adaptation and stability-focused learning |
| Representation-Learning Methods | Learn transferable and domain-invariant feature representations across tasks | Representation drift under highly heterogeneous task distributions | Low-Moderate | Domain-incremental learning and self-supervised continual adaptation |
| Prompt-Based and PEFT Methods | Adapt pretrained foundation models using prompts, adapters, or low-rank updates | Prompt interference, adapter scalability, and long-term stability | Low | Foundation models, multimodal systems, large-scale deployment |
| Federated and Privacy-Aware CL | Enable CL across distributed clients without centralized data sharing | Client drift, communication overhead, and heterogeneous data distributions | Moderate | Healthcare, finance, edge AI, and mobile systems |
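
To make the regularization-based row above concrete, the following minimal sketch shows an EWC-style quadratic penalty: each parameter's squared deviation from its value after the previous task is weighted by an importance estimate. The function name `ewc_penalty`, the weight `lam`, and all numbers are illustrative assumptions; in practice the importance values are usually a diagonal Fisher information estimate computed from gradients on the previous task's data.

```python
from typing import Sequence


def ewc_penalty(params: Sequence[float],
                anchor_params: Sequence[float],
                importance: Sequence[float],
                lam: float = 1.0) -> float:
    """Quadratic penalty: (lam / 2) * sum_i importance_i * (theta_i - anchor_i)^2."""
    return 0.5 * lam * sum(
        f * (p - a) ** 2
        for p, a, f in zip(params, anchor_params, importance)
    )


# Hypothetical two-parameter model after finishing the previous task.
anchor = [1.0, -2.0]
importance = [10.0, 0.1]  # the first parameter matters far more for the old task

# Moving the important parameter by 0.5 is penalized heavily ...
move_important = ewc_penalty([1.5, -2.0], anchor, importance)
# ... while moving the unimportant one by the same amount costs little.
move_unimportant = ewc_penalty([1.0, -1.5], anchor, importance)

print(move_important, move_unimportant)
```

The total training objective on a new task would be the new-task loss plus this penalty. Only the anchor parameters and importance values need to be stored, which is consistent with the low memory cost listed for this category.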
| Application Area | Description | Key Benefits | Examples |
|---|---|---|---|
| Healthcare and Medical Imaging | Enables dynamic adaptation to evolving medical knowledge, diseases, and patient data over time. | Personalized diagnostics, improved adaptability, and long-term patient monitoring. | Radiology systems adapting to new imaging techniques or emerging diseases like novel cancer types. |
| Robotics and Autonomous Systems | Allows robots and autonomous systems to learn new tasks, adapt to dynamic environments, and retain prior knowledge. | Efficient task performance, knowledge transfer, and adaptability in real-world scenarios. | Household robots learning new cleaning techniques while retaining old capabilities like object recognition. |
| Natural Language Processing (NLP) | Helps models stay updated with evolving language patterns, domain-specific knowledge, and user preferences. | Better understanding of new language constructs, improved domain adaptation, and enhanced usability. | Chatbots adapting to new slang or technical jargon while maintaining general conversational abilities. |
| Recommender Systems | Adapts to changing user preferences and updates content or product catalogs dynamically. | Improved user engagement, personalized recommendations, and scalability for diverse user bases. | Streaming platforms suggesting trending shows based on current preferences without forgetting past ones. |
| Cybersecurity | Learns from new attack patterns and threat vectors while retaining the ability to recognize older threats. | Improved security, real-time threat detection, and reduced vulnerability to emerging cyberattacks. | Intrusion detection systems identifying novel malware while protecting against traditional viruses. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).